Comparative analyses of educational methods have been
inconclusive; there are several possible explanations: (1)
[...] is optimally effective under specifically differing
[...]; (2) the studies tend to emphasize one input variable
to the exclusion of the others; (3) the methods do not have widely
[...] definitions; and (4) the methods are often tested before
[...] perfected. Comparative studies could be enhanced by more
[...] defining the methods that are being investigated and by
[...] the laboratory techniques and observation schemes that
[...]. In addition, the studies could be made more usable to
[...] by spelling out what standards of educational
[...] the researcher has employed. To obtain an answer for a
specific comparison in a designated milieu, carefully controlled
studies can be mounted using similar populations aimed at similar
instructional goals and exposed to two differing treatments. (EMH)
Introduction

The recent decade or so might be called 
(among other names) the Age of Consumerism.
Attention has been turned to the purchaser or 
user of services and products. It is becoming 
clear that being a discriminating consumer is not 
an easy job. Even the first step — formulating 
the most useful consumer questions — requires 
sophistication and training. The simple, straight- 
forward questions often overlay unrecognized 
complex problem areas. 

What do these consumer questions have in
common?

The patient asks his dentist: "Is it better 
to fill a decayed tooth or extract it?" 

The client asks the lawyer: "Is it better 
to take a case to trial or to settle out of court?" 

The parent asks the child psychologist: 
"Is reward more effective than punishment?" 

One common feature is that they all are 
likely to elicit the same answer from the expert: 

"It all depends ..." 

The educational researcher faces a similar 
problem when he is asked (or he asks himself) 
which instructional technique ("method" or
"treatment") is better. For example, "Is
computer-assisted instruction superior to the traditional
lecture method?" Consumers of education as well 
as researchers have for years asked this compara- 
tive question about a great variety of teaching 
methods. This paper attempts to analyze the 
question, to suggest when it is useful and when 
not, and to show how it must be reformulated to
serve some purposes. 
The comparative study, that is, research
that compares one teaching method
with another, has had a long history.
Extensive summaries of this methodological
literature appear periodically. Some refer
to a particular method (e.g., "Learning
in Discussions: A Résumé of the
Authoritarian-Democratic Studies."1) Some
refer to comparisons of any alternative
treatment* with the so-called traditional
methodology (e.g., The Teaching-Learning
Paradox: A Comparative Analysis of
College Teaching Methods)^. In every case
the ambiguity of results described by 
these vast surveys suggests the futility 
of continuing to search for a simple 
answer at this point, or at least suggests 
deferring additional research of this sort 
until there is sufficient clarification 
of the question. 

The usual conclusion reached upon 
completion of one of these surveys is: 
there is no clear answer; no clear evi- 
dence of superiority; no real basis for 
choosing one method or another. Further- 
more these reviews present a similar 
research history for each innovative 
method studied. (1) The originators of
a new method show its dramatically great- 
er effectiveness; (2) follow-up studies, 
some by acolytes of the innovators, begin 
to cast doubt on those first findings; 
(3) later studies often fail to show 
differences between treatments. And 
so in answer to the simple comparison 
question the expert reviewer is driven 
to emit that irritating phrase: "It 
depends ..." 

The irritation may be reduced if
we understand some of the reasons for
the failure of research in this area 
including the self-defeating nature of 
the question. Furthermore as we come to 
understand those reasons we can develop 
better research strategies for getting 
at some of the questions comparative 
studies aim to answer. 

* We will use "treatment" to refer
broadly to method, strategy or
technique of instruction.

Educational researchers Richard C.
Anderson and Gerald Faust^ comment:

In the abstract and without
qualifications, it is very
difficult to prove that one
method of instruction is better 
than another. About all a single 
experiment can show is that one 
particular lesson (or set of
lessons) is more effective than 
a comparison lesson. Suppose 
that a hypothetical experiment 
shows that students learn more 
from a series of lectures than 
from several textbook chapters 
covering the same material.
Obviously one should not con- 
clude from this study that all 
lectures are better than all 
textbooks. Nor should one 
conclude that the difference 
in achievement is due to 
superiority of the spoken word 
over the printed word. It may 
have been the case, for example 
that the lecturer expressed 
concepts more clearly or used 
more informative examples than 
the textbook, or that the percentage
of students that attended
class and listened to the
lectures was greater than the 
percentage that actually read 
the textbook. (page 4) 

. . . There are many explanations
for the inconclusive
results of studies of teaching
methods . . . first there is 
the obvious point that differ- 
ent methods may work best with 
different students, different 
purposes, and under different 
conditions. 

Second, there has been con- 
fusion about what is meant by 
a given method. Various in- 
vestigators, for example, have 
defined methods differently. 
Naturally, variations in method 
could lead to variable results 
with students . . . (page 5) 

. . . Third, each method empha- 
sizes a few processes and 
features of instruction but 
ignores others. In most 'methods
studies' important factors have
been overlooked. 

Finally (and, we believe, most
important in the majority of
the studies) the teaching methods
under consideration have not
been very good. Few investigators
have taken pains to
develop a method until it was
demonstrably effective. Frequently,
comparisons among
methods have been premature ..."
(page 5)

Let us examine closely two problems 
implied in the above comments. Basically 
they are problems of definition. They 
are concerned with the referents in the 
sentence "Is Treatment A better than
Treatment B?" Specifically, the key
problems, important contributors to the
failure of comparative research, are the
definitions of the methods employed
(technically: the experimental independent
variable) and of the word "better"
(technically: the experimental dependent
variable).

Defining the Method.

Very often an educational method is 
described solely in terms of its formal 
properties, for example, Computer Assisted
Instruction (CAI). Apparently any instruction
which is assisted by a computer
qualifies as CAI. But surely specialists
working in this area have other,
more critical, characteristics in mind
when they say CAI, for example:
self-pacing and immediate feedback during
instruction.

We would not think of asking a 
physician either "How effective are
pills?" or "Are pills more effective
than injections?" because we intuitively
know that the formal description of
the medication (the description in terms
of how it looks) is almost irrelevant.
Important are such things as what is in 
the pill and what is wrong with us when 
we take the pill. 

This is a key problem that vexes the 
educational researcher who is asked a 
question about the comparative effective- 
ness of methods since the methods are 
usually inadequately defined. Identify- 
ing obvious physical characteristics 
(e.g., a pill) is not sufficient. Many 
things which share those characteristics 
may have quite different effects. 

The referent problem becomes more
obscured in some cases than others. At
least with CAI the defining characteristic
(i.e., the use of a computer)
is objective. When in doubt the 
observer could call the Computing Centre 
and ask someone whether the object in 
question may properly be called a computer.
But take the case of programmed
instruction, or lecture-discussion,
or modularized instruction. It is
difficult to obtain agreement on whether
a given piece of instructional
material is or is not a program or a
module, or whether a class session was
really a lecture rather than a
lecture-discussion.

Before one can determine whether
one instance (a text) is a good exemplar
of a class (e.g., programmed instruction)
he or she should be able to state the
common attributes of members of the class.
Disagreement among observers at this point
almost precludes further exploration of
the original problem (i.e., "Does it
work?").

Conversely, definitions, especially
those based on formal properties, may
exclude instances that ought to be
included. Methods of instruction that
seem to be quite different on the face 
of it may share critical functional 
attributes. Looking only at the out- 
side, the package, as it were, is likely 
to be highly misleading. 

The standardization problem.

A pill may not be like another
pill, except in terms of trivial,
external attributes. But one aspirin
tablet, at least most of the pharmaceutical
advertisements tell us, is
quite like another. There is a good 
degree of standardization in drugs as 
well as in many other product areas. 
Here our medical analogy breaks down. 
Instructional "products" differ widely 
even when called by the same name and 
even when they share many common char- 
acteristics. 

Even though there may be agree- 
ment on the basis of objectively 
determined attributes that two in- 
stances are of the same class, there 
still are major differences between 
the two. Faced with two introductory 
textbooks both on the same subject,
both of about the same length and aimed
at the same reading level, both with
study questions, do-it-yourself
experiments, etc., the professor still
is likely to label one as the better 
text. One speaker will seem superior 
to another lecturer in an invited lecture 
series. 

The general questions "Does it work?" 
and "Is it better?" are unanswerable 
given the great differences between any 
two "its" one might choose. To put it
another way: Generalizations about the 
effectiveness of any "method" must be 
severely limited. Since there are likely 
to be large differences among the in- 
stances of the same method, the results 
of a study involving one instance or even 
a few instances probably will not apply 
universally to all instances of that 
method. 

It should be clear that comparisons 
between one instance of one method and 
one instance of another (e.g., my lecture 
versus your module) are especially likely 
to be specious. 

The problem of definition becomes 
transparent, does it not, when one term 
of the comparison, for example, the 
traditional method, is so poorly defined
that two examples of the traditional
method could be used to re-form the
question? Thus, the question "Is lecture
better than CAI?" becomes almost meaningless
if we know that professor A's
traditional lecture is better than
professor B's traditional lecture, and such
is usually the case.

If we can agree that within the
term lecture, or programmed text, or film
one could find the full range from very
effective to totally ineffective examples,
then obviously a comparison between this
method and some other requires a much more
explicit definition of the term. Otherwise
it is likely that in one study
Lecture A will prove to be superior to
CAI Lesson Type 1, while in a second
study CAI Lesson Type 2 will produce
better results than Lecture B. And
that kind of paradox seems to permeate
the method comparison literature, as we
pointed out earlier.
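
The paradox can be made concrete with a small sketch. The scores
below are invented solely for illustration (they come from no study),
and the names lecture_instances, cai_instances and pairwise_verdicts
are hypothetical. The sketch pairs every instance of one method with
every instance of the other and records which instance scores higher.
When instances of the same method differ this much, the verdict
depends on which pair happens to be chosen.

```python
from itertools import product

# Hypothetical mean achievement scores for individual instances of each
# method. These numbers are invented purely for illustration.
lecture_instances = {"Lecture A": 82, "Lecture B": 61, "Lecture C": 74}
cai_instances = {"CAI Lesson 1": 70, "CAI Lesson 2": 79, "CAI Lesson 3": 66}


def pairwise_verdicts(method_x, method_y):
    """Compare every instance of one method with every instance of the other.

    Returns a list of (instance_x, instance_y, winner) tuples, where winner
    is the name of the higher-scoring instance, or "tie".
    """
    verdicts = []
    for (name_x, score_x), (name_y, score_y) in product(
        method_x.items(), method_y.items()
    ):
        if score_x > score_y:
            winner = name_x
        elif score_y > score_x:
            winner = name_y
        else:
            winner = "tie"
        verdicts.append((name_x, name_y, winner))
    return verdicts


verdicts = pairwise_verdicts(lecture_instances, cai_instances)

# Count how many single-pair "studies" would favor each method.
lecture_wins = sum(1 for _, _, w in verdicts if w in lecture_instances)
cai_wins = sum(1 for _, _, w in verdicts if w in cai_instances)

for name_x, name_y, winner in verdicts:
    print(f"{name_x} vs {name_y}: {winner}")
print(f"lecture favored in {lecture_wins} pairings, CAI in {cai_wins}")
```

With these invented numbers, Lecture A outscores CAI Lesson 1 while
CAI Lesson 2 outscores Lecture B, so two single-pair studies would
report opposite conclusions about the same two "methods."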



Summary 

The problems raised are not trivial 
ones or academic nit-picking. Almost 
every researcher who has reviewed a set
of comparison studies echoes the words 
of Robert Hohn.^ 

Inadequate description of the 
experimental techniques as well 
as control conditions, is per- 
haps the greatest deficiency in 
the recent literature on teach- 
ing innovation. A large majority 
of the thirty-one studies reviewed 
for this paper which compared more 
than one strategy of teaching pro- 
vided incomplete information about 
both treatments employed. The 
typical procedure is to character- 
ize a particular treatment with a 
label such as "lecture", "tradition- 
al", "self paced" or "group" with 
little or no data or operational 
terms used to clarify what particular
interaction was occurring
within these groups. (page 3) 

The criterion problem.

Now let us look at the other half 
of the original questions, phrases like: 
"is better than" or "works well." What 
precisely does better mean? 

Success (i.e., "it works") can
refer to as many different criteria 
as there are observers. An instruct- 
ional technique might be viewed as 
successful if it produces the design- 
ated changes in students for less cost, 
or in less time, or while simultaneously 
producing more favourable student at- 
titudes toward the instruction. It may 
be judged as successful if it allows 
previously excluded groups to succeed 
in the instructional system, or provides 
enrichment for those students who are 
moving ahead rapidly. Success may mean 
that the method "impresses" students, 
faculty, government, alumni, etc., if
it presents the image of progress
(even though no evidence may have been
gathered about its instructional
effectiveness).

When we are deciding on the degree of
success, do we plan to look merely at
one criterion (cost, teaching effectiveness)
or at a wide set of outcomes?

Suppose we choose to look at effectiveness
in terms of student achievement.
In most studies all that is reported are
gross summaries of achievement test scores.
What do these "class quizzes" or "final 
grades" represent? They mean little if 
we do not know what the objectives of the 
instructional treatments were. One thing 
we do know is that it is very likely that 
different instructional objectives may
best be reached by using different
"methods." (A leading educational
psychologist, Robert Gagné, has recently devoted
an entire book to exploring this very
topic, contending throughout that there
are optimal methods for each general type
of educational objective.)^ For example,
it is likely that the best way to learn
a large number of simple associations
(e.g., the names of the bones of the
foot) is not the best way to learn
high-level decision-making skills (e.g.,
determining at which point in a river the
addition of a dam would have the fewest
negative effects on the ecological system).

In developing criteria for success 
one is faced with making explicit the 
general and specific purposes of the 
evaluation. If the decision to be made
about an innovation is going to be
primarily based on problems of funding, 
then, obviously, data on cost ought to be 
gathered. If the observer is interested 
in the cognitive impact of an instruct- 
ional method, he would not attach too 
much weight to the student attitude data 
supplied by the innovator. If the 
observer is interested in higher level 
cognitive objectives, results from class
tests on fact-recall should not impress
him.

Are comparison studies valid?

We have just stated that evaluation 
is purposeful. The question: "Is Treat- 
ment A superior to Treatment B?" may be 
either a research question having some 
theoretical significance or it may be a 
practical question, the answer to which 
will help to make decisions about funding, 
curriculum, etc. 

It is important at the outset to
determine what the purpose of the question
is. This paper has contended that
the simple question "Is Treatment A
superior to Treatment B?" is not
sophisticated enough for those engaged
in educational research designed
to produce broad, generally applicable
"laws."

Specific consumer use.

However, it may be sufficiently
sophisticated for the consumer to ask
when choosing between two specific
instructional items. If the consumer
is asking whether Textbook A is better than
Textbook B, given a particular set of
students and a particular set of
instructional objectives to be measured
by an established evaluation system,
then the question is appropriate.
But the consumer must be forewarned,
first, that the results and conclusions
of his study will be only as valid as the
research is rigorous, and second, that the
answer will be an extremely specific one. He
will learn in a properly carried out
study which textbook is better according
to his definition of "better."
He will not be justified in concluding
anything generally about texts as a
method.

For the individual professor
interested in improving his own course,
it probably is not worthwhile either to
become involved in complex and
sophisticated research in education
or to be overly concerned with the
method and treatment questions. The
best strategy might be to specify
his own course goals and construct
some means of determining how well his
students are progressing toward those
goals. Then he might choose any
innovative method which has some degree
of legitimacy and with which he feels
comfortable. Using broad guidelines
for developing more effective
instruction,^ he can produce closer and closer
approximations to those goals as he
continues to make changes and observe the
effects of those changes. This smacks of
tinkering and it lacks elegance; the professor
may not be able to contribute to the 
(insignificant) literature involving 
the comparison of treatments but he 
may end up with a much improved course. 
The search ought to continue for
generalizations which go far beyond the 
specific question: Should I use Text A 
or B? This is primarily a task for 
educational researchers who are likely 
to re-word the comparison question as 
stated at the outset of this article.
Instead of comparing "methods" described 
in terms of formal properties, the 
contemporary researcher will try to 
isolate critical functions in teaching 
methods (e.g., feedback to students, 
structure or content) and study varia- 
tions in these. Such work is going on 
apace in Psychology and Education. 

Decision-making.

If research cannot tell us, à
la Madison Avenue, whether Brand A is
better than Brand B, how can we make a 
decision when faced with such questions 
as: Should the University (the community, 
district, province) invest its resources 
in a particular innovation or teaching 
method? That question is being asked 
with increasing frequency today through- 
out the educational system. This article 
should have indicated that there is no 
easy answer. To obtain an answer for a
specific comparison in a designated milieu,
carefully controlled studies can be mounted
using similar populations, aimed at
similar instructional goals and exposed
to two differing treatments.
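
As a rough sketch of how the results of such a controlled study might
be summarized, assume that achievement scores on a common test are the
single agreed criterion. The scores are invented, the function name
compare_treatments is hypothetical, and the standardized difference
(Cohen's d with a pooled standard deviation) is one conventional
summary chosen here for illustration, not a procedure prescribed in
this article.

```python
from math import sqrt
from statistics import mean, variance


def compare_treatments(scores_a, scores_b):
    """Summarize a two-group comparison on one criterion.

    Returns (mean difference, standardized difference), where the
    standardized difference is Cohen's d using the pooled sample variance.
    """
    n_a, n_b = len(scores_a), len(scores_b)
    diff = mean(scores_a) - mean(scores_b)
    pooled_var = (
        (n_a - 1) * variance(scores_a) + (n_b - 1) * variance(scores_b)
    ) / (n_a + n_b - 2)
    return diff, diff / sqrt(pooled_var)


# Invented test scores for two similar groups of students, both taught
# toward the same instructional goals but with different treatments.
treatment_a = [72, 75, 78, 80, 85]
treatment_b = [68, 70, 74, 76, 82]

diff, effect_size = compare_treatments(treatment_a, treatment_b)
print(f"mean difference: {diff:.1f} points")
print(f"standardized difference: {effect_size:.2f}")
```

Whatever the outcome, it speaks only to these two treatments, these
students and this criterion; a different criterion (cost, time,
attitudes) would call for its own measure.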

It is not merely the lack of a data
base that makes our original question
unanswerable. Until recently we have
lacked an adequately sophisticated
engineering "science." In the gap
between research and application (that is,
between the laboratory and the classroom)
there is growing up an area of
applied study which is likely to develop
more and better means by which we can
define our questions and find ways
of answering them.